Lab Report 1: Google Fusion Tables
Azarmeen Dastoor

Google Fusion Tables: http://www.google.com/fusiontables
Gallery: https://sites.google.com/site/fusiontablestalks/stories

Overview
As mentioned in my seminar presentation, Google Fusion Tables is a free alternative for visualizing locational datasets that avoids the constraints of the free version of Google Earth. The Fusion Tables API also allows users to make standard timelines, scatterplots, and pie, bar, and line graphs. Users can make simple or intricate visualizations without any programming if they wish to go this route. Imported data must be in a readable file format: .csv, .tsv, .txt, .kml, or a Google Spreadsheet. Fusion Tables also calculates values automatically if a formula is defined, just like in Microsoft Excel. The table columns are customizable according to the type of data; the default columns are text, number, location, and date. Users add or delete columns depending on the amount of information they want to convey for each location. Some additional columns a user might want to include are source of information and tags.

Intensity Mapping
The feature I find most useful in Fusion Tables is the intensity map visualization (see Figure 1 and Figure 2). It’s an easy way to convey demographic information without coding. Fusion Tables automatically takes the minimum and maximum values the user inputs to scale and colour code the map accordingly. In class I demonstrated how to pinpoint a location using Fusion Tables without coding: mouse over the location column and a Google Earth icon appears. Once the user clicks it, a new window appears where a state or country name can be entered into the search box. Click the placemark that appears and select “Use this location.” Alternatively, users can use latitude and longitude to find a location. Initially I found this confusing, since every place I looked up on the Internet gave its coordinates in degrees, minutes and seconds. Going back to Fusion Tables for this lab and reading through the API manual properly, I found that the latitude and longitude must be written in decimal format, not in minutes and seconds. For Fusion Tables to recognize the location, the latitude and longitude decimal values must also be separated by either a space or a comma.
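For anyone who only has coordinates in degrees, minutes and seconds, the conversion to decimal format is simple arithmetic: degrees + minutes/60 + seconds/3600, made negative for south and west. The sketch below is a minimal illustration of that conversion; the function names and the Ottawa coordinates are my own illustrative assumptions, not anything provided by Fusion Tables.

```javascript
// Convert degrees/minutes/seconds to decimal degrees.
// hemisphere is "N", "S", "E", or "W"; south and west become negative.
function dmsToDecimal(degrees, minutes, seconds, hemisphere) {
  const magnitude = Math.abs(degrees) + minutes / 60 + seconds / 3600;
  const sign = hemisphere === "S" || hemisphere === "W" ? -1 : 1;
  return sign * magnitude;
}

// Join latitude and longitude with a comma, as the location column expects
// a space or a comma between the two decimal values.
function toLocationCell(lat, lng, decimals = 4) {
  return lat.toFixed(decimals) + ", " + lng.toFixed(decimals);
}

// Approximate coordinates for Ottawa (illustrative values only).
const lat = dmsToDecimal(45, 25, 12, "N");
const lng = dmsToDecimal(75, 41, 24, "W");
console.log(toLocationCell(lat, lng)); // 45.4200, -75.6900
```

The resulting string can then be pasted into the location column.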

GMT_1.png
GMT_2.png
Figure 1: Standard map view showing extra information after the user clicks a marked city.
Figure 2: Intensity map view of Canada only. Notice that the city of Ottawa is not counted in this view, and the scale is automatically created by Fusion Tables.

My Experiment
Click here to view my experiment or click below to download my .csv file.


In the example I created, I entered my locational data in several ways (addresses, city names, and latitude and longitude) to see if Fusion Tables would still render it. I wanted to plot where my friends are around the world according to the city they live in. I chose these friends strategically so I would have more than five people per city to use for this dataset.

The columns I decided to use/modify were city name, location, number of friends, and top three friends within that city (I changed the names in the Fusion Tables dataset for exporting here). To rename the columns and sort them in the order I wanted, I went to Edit > Modify columns. Similarly, to add each row, I went to Edit > Add row. To check whether a location had been successfully entered and marked, I went to Visualize > Map. A great website that shows the latitude and longitude of any place around the world is Find Latitude and Longitude. This website also displays the coordinates when the user clicks an area on the interactive map, which makes copy-pasting the coordinates into Fusion Tables extremely fast. Once the Fusion Table has all its data listed, the user can export it as a .csv file (the default format).

Opinions
Saving?
I find it very frustrating that there is no save option offered. While experimenting for this lab and for my seminar presentation, I constantly created new tables because I had no idea where my previous ones had gone or why they had disappeared. For some reason, the tables I made did not show up on my Google Fusion Tables homepage automatically. I didn’t know whether what I was typing was being saved as a draft, or saved at all. There’s no indication in this API whatsoever. I decided to save the URLs of any future Fusion Tables I work on (possibly for my final project in this class) as a method of recovery, just in case.
EDIT: Google Fusion Tables saved most of my work in Google Docs instead, which is odd considering Fusion Tables has its own homepage.

Location
I also mentioned before that I found the location column in Fusion Tables confusing. Without noticing the little Google Earth icon popping up, I would have thought I could just enter a location within the text field and Google Fusion Tables would recognize it – but that is not the case! The user has to select “Use this location” in order for the placemark to show up in the standard map or intensity map.

Comments
A component I liked about Fusion Tables was its comments feature (see Figure 3). If I collaborated on my final project with a partner, this would be a convenient way to discuss any issues or typos within a particular row of the dataset. A small, blue triangular arrow on the top right indicates that comments have been added for that row.

GMT_3.png
Figure 3: A sample comment displayed.

Scaling and Colour
It was easy to figure out how to set the scale of the area I wanted to highlight. For instance, since I have many friends in Canada, I can choose to focus on this country in particular, rather than the world. This can be done in the intensity map by typing “Canada” instead of the default “World” in the Area text field. Fusion Tables automatically highlights the entire area around each city according to its state (or provincial) borders. A downside, as seen in Figure 2, is that if two entries fall within the same area (Ottawa and Toronto, for instance), only one of them shows up on mouseover. The other value is not visible. In this case, a formula must be used to calculate the total number of friends I have in the province of Ontario. Something I have yet to figure out is how to change the colour of the intensity map. There is no user-friendly option available, so I assume it has to be done through programming.
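The totalling step for Ontario amounts to summing the friends column over every row in the same province. The sketch below shows that calculation in plain JavaScript rather than as a Fusion Tables formula. The province column and the friend counts are hypothetical values made up for illustration; they are not taken from my exported dataset.

```javascript
// Hypothetical rows: the province column and the counts are made up.
const rows = [
  { city: "Ottawa", province: "Ontario", friends: 6 },
  { city: "Toronto", province: "Ontario", friends: 9 },
  { city: "Vancouver", province: "British Columbia", friends: 7 },
];

// Sum the friends column for every row that shares a province.
function totalsByProvince(data) {
  const totals = {};
  for (const row of data) {
    totals[row.province] = (totals[row.province] || 0) + row.friends;
  }
  return totals;
}

const totals = totalsByProvince(rows);
console.log(totals["Ontario"]); // 15
```

With one total per province, the intensity map would have a single value to display for Ontario instead of hiding one of the two cities.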


Lab Report 2: HTML, CSS and JavaScript Stacked Bar Graph
Azarmeen Dastoor

Overview
After I solidified a topic for my final project, I needed to figure out which type of graph would best suit my dataset. I was browsing through some of the different types of graphs in Nathan Yau’s Visualize This and found that a stacked bar graph was most appropriate to show the type of data that I needed to visualize.

My Final Project and Choosing a Graph
Before I get into the reasoning behind choosing a stacked bar graph over other types of graphs, I want to briefly mention my final project and the type of data I’ll be working with. The primary message behind my visualization will be to show a relationship between the amount of campaign money raised for a single candidate and the outcome of an election. Ultimately, I’m trying to show money being translated into votes, and using this as a tool to predict the 2012 congressional elections for the state of Ohio based on the amount of money raised thus far.

Original_Site.png
Figure 1: A screenshot of what the original dataset looks like on the website. It shows the amount of money raised by candidates from the 2008 Ohio congressional election. As you can see, this bar graph is really pointless since some bars aren't even visible (it appears worse in some of the other election years). There needs to be another way to represent this data much more efficiently. This data was retrieved from OpenSecrets.

With the kind of information conveyed in the dataset above, I needed a graph that compares two or more pieces of information within the same category. Essentially, I needed to show the staggering amount of money one candidate had (usually the eventual winner of the election) versus the money all the other candidates combined had in their campaign budgets. This is why I focused on chapter five of Nathan’s book, where he discusses how proportions can be visualized differently depending on the type of data used. Early in the chapter Nathan states that in datasets with extreme values, such as mine with some very high amounts and some staggeringly low ones, it isn’t necessary to show the maximum and minimum values; instead, it is the distribution of proportions that will best convey the (winning candidate’s) supremacy and the (losing candidates’) inferiority (177). To plot this, Nathan lists the types of graphs that would be appropriate: a pie chart, donut chart, stacked bar graph, stacked area graph, and tree map. These graphs all show one statistic in relation to another by juxtaposing them.

As a matter of personal preference I wanted to stay away from the standard pie chart simply because I think it’s overused. My final project uses a donut chart but has stacked bar graphs inside each cell of the donut chart to convey the numerical data. I chose the stacked bar graph for two reasons: first, it is easy to make and manipulate in Illustrator, and second, it allows for easy categorization. In my project I have two main categories: the year of the election and the number of the district. Instead of listing the individual campaign budgets of each candidate, I decided to combine all the losing candidates and separate the winner, since his/her amounts were, for the most part, substantially higher than the rest. I did this by adding up the totals of the candidates in each district and dividing by the total budget of the winning candidate to get a percentage value in the end.
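The sketch below shows one way the combine-and-compare step might look in code. It is only an illustration under stated assumptions: the dollar amounts are hypothetical, and the shares are taken relative to the district's combined total so that the two segments of a bar add up to 100%, which may differ from the exact division described above.

```javascript
// Hypothetical fundraising totals (in dollars) for one district; not real data.
const district = { winner: 1500000, losers: [300000, 150000, 50000] };

// Combine all losing candidates, then express both groups as a share of the
// district total so the two stacked segments sum to 100.
function budgetShares(d) {
  const losersTotal = d.losers.reduce((sum, amount) => sum + amount, 0);
  const districtTotal = d.winner + losersTotal;
  return {
    winnerPct: (d.winner / districtTotal) * 100,
    losersPct: (losersTotal / districtTotal) * 100,
  };
}

const shares = budgetShares(district);
console.log(shares.winnerPct, shares.losersPct); // 75 25
```

Each district and election year would produce one such pair of percentages, which is the shape of data a stacked bar needs.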

SBG_1.png

Figure 2: My stacked bar graph coded using HTML, CSS and JavaScript.

My Experiment

Click above to download a .zip file of my work. Please note that I did not write the "protovis" JavaScript file; it is the Protovis library that comes with Nathan Yau's example. If you would like to see how I coded this graph, open my HTML file and go to View > View Source.

Although my project won’t be Web-based, for the purposes of this lab I’ve made an HTML, CSS and JavaScript stacked bar graph using Nathan’s example in the textbook as a guide. I made an interactive graph showing the percentage of votes in each zone of a fake town I made up. This emulates the type of dataset I’m using for my final project.

Learning the Basics
To make this graph, I first copy-pasted all the code Nathan wrote in his book under the stacked bar section. Basically, just like most programming languages, the coding ends up being like a sandwich – whatever is open up top ends up being closed on the bottom in reverse order:
  • (1) <Start HTML>
        • Lays out the basic framework of a webpage.
    • <JavaScript Import>
        • Link to Nathan’s "protovis" JavaScript file (make sure it is sourced properly; in the original script Nathan made a separate folder for it, but in my file I took that folder out and put everything into one folder).
    • (2) <Start CSS>
        • Handles the design and overall look of the title, graph and text captions.
    • (2) </End CSS>
    • (3) <Start JavaScript>
        • This makes the graph scale, allows it to be interactive and calculates the values necessary to produce a stacked bar graph.
    • (3) </End JavaScript>
  • (1) </End HTML>
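The "sandwich" rule in the outline above (whatever opens first closes last) can be checked mechanically with a stack. The sketch below is a small illustration of that idea only; the token names are simplified stand-ins for the sections in the outline, not real tag names from the file.

```javascript
// Check that sections close in the reverse order they were opened.
// A token starting with "/" closes the section of the same name.
function isProperlyNested(tokens) {
  const open = [];
  for (const token of tokens) {
    if (token.startsWith("/")) {
      // The section being closed must be the most recently opened one.
      if (open.pop() !== token.slice(1)) return false;
    } else {
      open.push(token);
    }
  }
  // Nothing may be left open at the end.
  return open.length === 0;
}

const goodOrder = ["html", "css", "/css", "javascript", "/javascript", "/html"];
const badOrder = ["html", "css", "/html", "/css"];
console.log(isProperlyNested(goodOrder)); // true
console.log(isProperlyNested(badOrder)); // false
```

The second sequence fails because the HTML section is closed while the CSS section inside it is still open.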

JavaScript
I strongly suggest not modifying the JavaScript code of the HTML file unless you are familiar with this language or have a good understanding of programming in general. Nathan’s file is very organized, neatly coded and commented to make it easy for us to modify, but there were only a few things I was comfortable changing, such as the positioning and colour of the bars and the size of the graph. Otherwise, I left this section untouched. However, if someone wants to get rid of the default "%" that comes up in the tooltip (on mouseover of the bars), the section to modify is under "Stacked layout" where it says ".title(function(d) d + "%")". Whatever is in the quotation marks is appended to the value in the output.
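The shorthand function(d) d + "%" leaves out the braces and the return keyword; written out in full, it behaves like the first function below. This is a minimal sketch in standard JavaScript to show what the tooltip text looks like with and without the suffix. The function names are my own and are not part of the original file.

```javascript
// Written-out equivalent of the shorthand function(d) d + "%".
function percentLabel(d) {
  return d + "%";
}

// Variant without the suffix: the tooltip shows only the number.
function plainLabel(d) {
  return String(d);
}

console.log(percentLabel(42)); // 42%
console.log(plainLabel(42)); // 42
```

Replacing "%" with another string, such as " votes", would change the suffix in the same way.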
An important thing to remember while inputting data into the JavaScript section is to keep the comma-separated values (arrays) in their proper order. In my example I have Zones 1-5, which means each array or row (the winners row and the losers row) needs four commas separating five numbers. Every time I add a new row, the computer automatically creates a new ‘stack’ on top of the last set of bars. In my code I first put the winners row, followed by the losers. In the output, the winners are at the base of the graph while the losers take the higher position. Whatever is coded first will end up on the bottom.
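To make the stacking order concrete, the sketch below works out where each segment starts and ends when rows are stacked in the order they are listed. It is a plain JavaScript illustration of the idea, not the code from the original file, and the percentages for Zones 1-5 are hypothetical.

```javascript
// Hypothetical percentages for Zones 1-5; one array (row) per stacked layer.
const winners = [60, 55, 70, 52, 65];
const losers = [40, 45, 30, 48, 35];

// The first row sits at the base; each later row starts where the
// previous one ended in the same zone.
function stackLayers(layers) {
  const running = new Array(layers[0].length).fill(0);
  return layers.map((layer) =>
    layer.map((value, zone) => {
      const bottom = running[zone];
      running[zone] += value;
      return { bottom: bottom, top: running[zone] };
    })
  );
}

const stacked = stackLayers([winners, losers]);
console.log(stacked[0][0]); // { bottom: 0, top: 60 }
console.log(stacked[1][0]); // { bottom: 60, top: 100 }
```

Swapping the two arrays in the call would put the losers at the base instead, which is why the order of the rows matters.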

CSS
The CSS part is where all the customization can happen. The selector and property names are straightforward and don’t cause any confusion. Even those who have less-than-basic CSS skills can modify this part since it’s so simple. I changed the original file’s serif font to a sans-serif font to make it look more modern. I made the header font smaller and increased the width of the wrapper, even though it wasn’t necessary since I only have one descriptive line in my example. I deleted the spacing that was originally between each line of text in order to move the graph up. Note that the labels on the right (i.e., "Winner’s votes" and "Opponent’s votes") have to be positioned manually; otherwise the layout of the graph gets messed up.
This is an extremely uncomplicated graph to make. It requires limited knowledge of programming and would be great to use for Web-based information visualizations.

Works Cited

Yau, Nathan. Visualize This: The FlowingData Guide to Design, Visualization and Statistics. Wiley, 2011.